
    Multiple vocabulary coding for 3D shape retrieval using Bag of Covariances

    The Bag of Covariances (BoC) descriptor has recently been introduced as an extension of the standard Bag of Words (BoW) model to the space of positive semi-definite matrices, which has a Riemannian structure. BoC descriptors can be constructed with various Riemannian metrics and using various quantization approaches. Each construction incurs some quantization error, which is often reduced by increasing the vocabulary size. This, however, results in a signature that is not compact, increasing both storage and computational complexity. This article demonstrates that a compact signature, with minimum distortion, can be constructed using multiple-vocabulary coding. Each vocabulary is constructed from a different quantization of the covariance feature space. The proposed method also extracts non-linear dependencies between the different BoC signatures to compose the final compact signature. Our experiments show that the proposed approach can boost the performance of BoC descriptors in various 3D shape classification and retrieval tasks.
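
    As a rough illustration of the multiple-vocabulary idea, the sketch below builds several k-means vocabularies over log-mapped covariance matrices (using the log-Euclidean metric, one of several possible Riemannian choices) and concatenates the per-vocabulary histograms. The function names, vocabulary sizes, and seeds are illustrative assumptions, and the paper's non-linear fusion step is only indicated, not reproduced.

```python
# Hypothetical sketch of multiple-vocabulary BoC coding, assuming the
# log-Euclidean metric; names and parameters are illustrative.
import numpy as np
from scipy.linalg import logm
from sklearn.cluster import KMeans

def log_vec(cov):
    """Map an SPD covariance matrix to a Euclidean vector via the matrix log."""
    L = np.real(logm(cov))
    return L[np.triu_indices_from(L)]  # upper triangle suffices (symmetric)

def build_vocabulary(train_covs, k, seed):
    """One vocabulary = a k-means codebook over log-mapped covariances.
    Different sizes/seeds stand in for different quantization methods."""
    X = np.stack([log_vec(c) for c in train_covs])
    return KMeans(n_clusters=k, random_state=seed, n_init=10).fit(X)

def boc_signature(shape_covs, vocabularies):
    """Encode one shape: a normalized histogram per vocabulary, concatenated.
    The paper further compresses this via a non-linear fusion step (not shown)."""
    X = np.stack([log_vec(c) for c in shape_covs])
    hists = []
    for km in vocabularies:
        h = np.bincount(km.predict(X), minlength=km.n_clusters).astype(float)
        hists.append(h / max(h.sum(), 1.0))
    return np.concatenate(hists)
```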

    Learning Faculty to Teach with an E-Learning Platform: Some Design Principles

    In: A.J. Kallenberg and M.J.J.M. van de Ven (Eds.), 2002, The New Educational Benefits of ICT in Higher Education: Proceedings. Rotterdam: Erasmus Plus BV, OECR. ISBN 90-9016127-9.
    The implementation of electronic learning platforms requires new competencies of faculty members. Institutions of higher education are challenged to support their staff in acquiring those competencies, and training is a promising way to do so. This paper briefly describes a faculty development programme at the Katholieke Universiteit Leuven (Belgium). The evaluation of the programme, by both the trainers and the participants, clearly pointed to the necessity of carefully analysing the characteristics of the participants, as well as the competencies needed to integrate an e-learning platform successfully into one's teaching practice. This exercise led to the formulation of some design principles for faculty development programmes on e-learning platforms. The change in faculty members' teaching conceptions, as well as attention to their stages of concern, emerge as crucial elements of these principles.

    The use of tongue protrusion gestures for video-based communication

    We propose a system that tracks the mouth region in a video sequence and detects the occurrence of a tongue protrusion event. Assuming that the natural location of the tongue is inside the mouth, the tongue protrusion gesture is interpreted as an intentional communication sign that the user wishes to perform a task. The system operates in three steps: (1) mouth template segmentation, in which we initialize one template for the entire mouth and one template for each of the left and right halves of the mouth; (2) mouth region tracking using the Normalized Correlation Coefficient (NCC); and (3) tongue protrusion event detection and interpretation. We regard the tongue protrusion transition as the event that begins when a minimal part of the tongue starts protruding from the mouth and ends when the protrusion is clearly visible. The left and right templates are compared to their corresponding halves of each newly tracked mouth image, yielding a left NCC and a right NCC. By analyzing the NCCs during the tongue protrusion transition, the left or right position of the protrusion, relative to the center of the mouth, is determined. We analyze our proposed communication method and demonstrate that it adapts easily to different users. The detection of this gesture can be used, for instance, as a dual-switch, hands-free human-computer interface for granting control of a computer.
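
    A minimal sketch of the half-mouth NCC comparison described above, assuming OpenCV's cv2.matchTemplate with the TM_CCOEFF_NORMED method as the NCC implementation; the threshold value and function names are illustrative assumptions, not the paper's exact parameters.

```python
# Sketch: compare left/right half-mouth templates against the tracked mouth
# region; a drop in NCC on one side suggests a protrusion on that side.
import cv2

def half_mouth_nccs(mouth_roi, left_tmpl, right_tmpl):
    """Return (left_ncc, right_ncc) for the current tracked mouth region.
    Grayscale uint8 images assumed; templates must fit inside each half."""
    h, w = mouth_roi.shape[:2]
    left_half, right_half = mouth_roi[:, : w // 2], mouth_roi[:, w // 2 :]
    left_ncc = cv2.matchTemplate(left_half, left_tmpl, cv2.TM_CCOEFF_NORMED).max()
    right_ncc = cv2.matchTemplate(right_half, right_tmpl, cv2.TM_CCOEFF_NORMED).max()
    return float(left_ncc), float(right_ncc)

def classify_protrusion(left_ncc, right_ncc, thresh=0.6):
    """Threshold is an assumed value, not from the paper."""
    if left_ncc < thresh and left_ncc < right_ncc:
        return "left"
    if right_ncc < thresh and right_ncc < left_ncc:
        return "right"
    return "none"
```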

    Detection of tongue protrusion gestures from videos

    We propose a system that, using video information, segments the mouth region from a face image and then detects the protrusion of the tongue from inside the oral cavity. Initially, under the assumption that the mouth is closed, we detect both mouth corners. We use a set of specifically oriented Gabor filters to enhance the horizontal features corresponding to the shadow between the upper and lower lips. After applying the Hough line detector, the extremes of the detected line are regarded as the mouth corners. The detection rate for mouth corner localization is 85.33%. These points are then input to a mouth appearance model, which fits a mouth contour to the image. By segmenting its bounding box we obtain a mouth template. Next, exploiting the symmetric nature of the mouth, we divide the template into right and left halves; our system thus makes use of three templates. We track the mouth in subsequent frames using normalized correlation for mouth template matching. Changes in the mouth region are directly reflected in the correlation value, i.e., the appearance of the tongue on the surface of the mouth causes the correlation coefficient to decrease over time. These coefficients are used for detecting the tongue protrusion. The right and left tongue protrusion positions are detected by analyzing similarity changes between the right and left half-mouth templates and the currently tracked ones. Detection rates under the default parameters of our system are 90.20% for the tongue protrusion regardless of position, and 84.78% for the right and left tongue protrusion positions. Our results demonstrate the feasibility of real-time tongue protrusion detection in vision-based systems and motivate further investigation of this new modality in human-computer communication.
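
    The corner-localization stage could be sketched as follows, with OpenCV's getGaborKernel and HoughLinesP standing in for the oriented Gabor filtering and Hough line detection; all filter and threshold parameters here are placeholder assumptions, not the paper's tuned values.

```python
# Sketch: enhance the horizontal lip shadow with a Gabor filter, then take the
# endpoints of the dominant Hough line as the mouth corners.
import cv2
import numpy as np

def mouth_corners(gray_mouth):
    # theta = pi/2 orients the kernel so horizontal structures respond most
    # strongly (OpenCV's convention); all parameters are illustrative.
    kernel = cv2.getGaborKernel(ksize=(21, 21), sigma=4.0, theta=np.pi / 2,
                                lambd=10.0, gamma=0.5)
    enhanced = cv2.filter2D(gray_mouth, cv2.CV_8U, kernel)
    edges = cv2.Canny(enhanced, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=30,
                            minLineLength=gray_mouth.shape[1] // 3, maxLineGap=5)
    if lines is None:
        return None
    # Keep the longest (most horizontal-spanning) line; its extremes
    # approximate the two mouth corners.
    x1, y1, x2, y2 = max(lines[:, 0], key=lambda l: abs(l[2] - l[0]))
    return (x1, y1), (x2, y2)
```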

    Multi-view shape reconstruction in CAVE system

    Nowadays, many applications, ranging from medicine to video games, require accurate 3D geometry and motion of human actors.

    Modeling the Personal Space of Virtual Agents for Behavior Simulation

    In this paper we propose a mathematical model of the concept of personal space (PS) and apply it to simulate non-verbal communication between agents in virtual worlds. The distance between two persons reflects the type of their relationship, and human-like autonomous virtual agents should be equipped with this capability to simulate natural interactions. We define three types of relationships: (1) stranger, (2) business, and (3) friendly. First, we model the space around an agent as a probability distribution function that reflects, at each point in space, the importance of that point to the agent. The agent dynamically updates this function according to (1) his relation with the other agent, (2) his face orientation, and (3) the evolution of the relationship over time, as a stranger agent may become a friend. We demonstrate the concept on a multi-agent platform and show that space-aware agents exhibit more natural behavior.
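
    One plausible reading of such a PS function is an anisotropic Gaussian around the agent, wider along the facing direction and scaled by the relationship type; the radii and front/back weighting below are illustrative assumptions, not the paper's exact distribution.

```python
# Sketch of a personal-space field: high values mean "too close" to the agent.
import numpy as np

PS_RADIUS = {"stranger": 2.0, "business": 1.2, "friend": 0.6}  # metres, assumed

def personal_space(point, agent_pos, facing, relation):
    """Importance of `point` to the agent, given relation and face orientation."""
    d = np.asarray(point, dtype=float) - np.asarray(agent_pos, dtype=float)
    f = np.asarray(facing, dtype=float) / np.linalg.norm(facing)  # unit facing
    along = d @ f                        # signed distance in front of the agent
    across = d - along * f               # lateral offset from the facing axis
    r = PS_RADIUS[relation]
    # Assumption: space in front of the agent matters more than behind (2x).
    sigma_front = r * (2.0 if along > 0 else 1.0)
    return np.exp(-(along / sigma_front) ** 2
                  - (np.linalg.norm(across) / r) ** 2)
```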

    Detection and analysis of wheat spikes using Convolutional Neural Networks

    Background: Field phenotyping by remote sensing has received increased interest in recent years, with the possibility of achieving high-throughput analysis of crop fields. Along with the various technological developments, the application of machine learning methods for image analysis has enhanced the potential for quantitative assessment of a multitude of crop traits. For wheat breeding purposes, assessing the production of wheat spikes, as the grain-bearing organ, is a useful proxy measure of grain production. Thus, being able to detect and characterize spikes from images of wheat fields is an essential component in a wheat breeding pipeline for the selection of high-yielding varieties.
    Results: We have applied a deep learning approach to accurately detect, count and analyze wheat spikes for yield estimation. We have tested the approach on a set of images of a wheat field trial comprising 10 varieties subjected to three fertilizer treatments. The images were captured over one season, using high-definition RGB cameras mounted on a land-based imaging platform and viewing the wheat plots from an oblique angle. A subset of in-field images was accurately labeled by manually annotating all the spike regions. This annotated dataset, called SPIKE, is then used to train four region-based Convolutional Neural Networks (R-CNN), which take images of wheat plots as input and accurately detect and count the spike regions in each plot. The CNNs also output the spike density and a classification probability for each plot. Using the same R-CNN architecture, four different models were generated based on four different datasets of training and testing images captured at various growth stages. Despite the challenging field imaging conditions, e.g., variable illumination, high spike occlusion, and complex background, the four R-CNN models achieve an average detection accuracy ranging from 88% to 94% across different sets of test images. The most robust R-CNN model, which achieved the highest accuracy, is then selected to study the variation in spike production over the 10 wheat varieties and three treatments. The SPIKE dataset and the trained CNN are the main contributions of this paper.
    Conclusion: With the availability of good training datasets such as the SPIKE dataset proposed in this article, deep learning techniques can achieve high accuracy in detecting and counting spikes from complex wheat field images. The proposed robust R-CNN model, which has been trained on spike images captured during different growth stages, is optimized for application to a wider variety of field scenarios. It accurately quantifies the differences in yield produced by the 10 varieties we have studied, and their respective responses to fertilizer treatment. We have also observed that the other R-CNN models exhibit more specialized performances. The dataset and the R-CNN model, which we make publicly available, have the potential to greatly benefit plant breeders by facilitating the high-throughput selection of high-yielding varieties.
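
    As a hedged sketch of how such a detector could be trained and used for counting, the snippet below fine-tunes torchvision's Faster R-CNN, a region-based CNN in the same family as the paper's models; the class count, score threshold, and dataset handling are assumptions, and loading the SPIKE dataset itself is not shown.

```python
# Sketch: fine-tune a region-based detector (background + spike) and count
# spikes per plot image as the detections above a confidence threshold.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def spike_detector(num_classes=2):  # background + spike (assumed)
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model  # train on SPIKE-style boxes with the usual detection loop

def count_spikes(model, image, score_thresh=0.5):
    """image: float tensor, CxHxW in [0, 1]; threshold is an assumed value."""
    model.eval()
    with torch.no_grad():
        pred = model([image])[0]
    return int((pred["scores"] > score_thresh).sum())
```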

    Improving engagement of stroke survivors using desktop virtual reality-based serious games for upper limb rehabilitation: A multiple case study

    Engagement with upper limb rehabilitation post-stroke can improve rehabilitation outcomes, and Virtual Reality can be used to make rehabilitation more engaging. In this paper, we present a multiple case study to determine: (1) whether game design principles (identified in an earlier study as being likely to engage) actually do engage, in practice, a sample of stroke survivors with a Desktop Virtual Reality-based Serious Game designed for upper limb rehabilitation; and (2) what game design factors support the existence of these principles in the game. In this study, we considered 15 principles: awareness, feedback, interactivity, flow, challenge, attention, interest, involvement, psychological absorption, motivation, effort, clear instructions, usability, purpose, and a first-person view. Four stroke survivors used, for a period of 12 weeks, a Virtual Reality-based upper limb rehabilitation system called the Neuromender Rehabilitation System. The stroke survivors were then asked how well each of the 15 principles was supported by the Neuromender Rehabilitation System and how much they felt each principle supported their engagement with the system. All 15 tested principles had good or reasonable support from the participants as being engaging. Use of feedback was emphasised as an important design factor for supporting the design principles, but there was otherwise little agreement on important design factors among the participants. This indicates that more personalised experiences may be necessary for optimised engagement. The insight gained can be used to inform the design of a larger-scale statistical study into what engages stroke survivors with Desktop Virtual Reality-based upper limb rehabilitation.

    Text to image synthesis for improved image captioning

    Generating textual descriptions of images has been an important topic in computer vision and natural language processing, and a number of deep learning techniques have been proposed for it. These techniques use human-annotated images for training and testing the models, and they require a large amount of training data to perform at their full potential. Collecting images with human-generated captions is expensive and time-consuming. In this paper, we propose an image captioning method that uses both real and synthetic data for training and testing the model. We use a Generative Adversarial Network (GAN) based text-to-image generator to produce the synthetic images, and an attention-based image captioning method trained on both real and synthetic images to generate the captions. We demonstrate the results of our models through both qualitative and quantitative analysis on commonly used evaluation metrics. Our experimental results show the two-fold benefit of the proposed work: (i) it demonstrates the effectiveness of image captioning for synthetic images, and (ii) it further improves the quality of the generated captions for real images, understandably because we use additional images for training.
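
    The real-plus-synthetic training mix can be sketched as below: synthetic (image, caption) pairs produced by the text-to-image GAN (not shown) are simply pooled with the real pairs before captioner training. The dataset objects and the commented training loop are hypothetical placeholders, not the paper's exact pipeline.

```python
# Sketch: pool real and GAN-synthesized (image, caption) pairs for training.
from torch.utils.data import ConcatDataset, DataLoader

def make_training_loader(real_pairs, synthetic_pairs, batch_size=32):
    """real_pairs / synthetic_pairs: datasets yielding (image, caption) tuples."""
    combined = ConcatDataset([real_pairs, synthetic_pairs])
    return DataLoader(combined, batch_size=batch_size, shuffle=True)

# Schematic training loop: the attention-based captioner sees both kinds of
# images, which is the source of the two-fold benefit described above.
# for images, captions in make_training_loader(real_ds, synth_ds):
#     loss = captioner(images, captions)
#     loss.backward(); optimizer.step(); optimizer.zero_grad()
```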

    Weed recognition using deep learning techniques on class-imbalanced imagery

    Context: Most weed species can adversely impact agricultural productivity by competing for the nutrients required by high-value crops. Manual weeding is not practical for large cropping areas, and many studies have been undertaken to develop automatic weed management systems for agricultural crops. In this process, one of the major tasks is to recognise the weeds from images. However, weed recognition is a challenging task, because weed and crop plants can be similar in colour, texture and shape, and these similarities can be exacerbated further by the imaging, geographic, or weather conditions when the images are recorded. Advanced machine learning techniques can be used to recognise weeds from imagery.
    Aims: In this paper, we have investigated five state-of-the-art deep neural networks, namely VGG16, ResNet-50, Inception-V3, Inception-ResNet-v2 and MobileNetV2, and evaluated their performance for weed recognition.
    Methods: We have used several experimental settings and multiple dataset combinations. In particular, we constructed a large weed-crop dataset by combining several smaller datasets, mitigated class imbalance by data augmentation, and used this dataset in benchmarking the deep neural networks. We investigated the use of transfer learning techniques by preserving the pre-trained weights for extracting the features and fine-tuning them using the images of crop and weed datasets.
    Key results: We found that VGG16 performed better than the others on small-scale datasets, while ResNet-50 performed better than the other deep networks on the large combined dataset.
    Conclusions: This research shows that data augmentation and fine-tuning techniques improve the performance of deep learning models for classifying crop and weed images.
    Implications: This research evaluates the performance of several deep learning models, offers directions for using the most appropriate models, and highlights the need for a large-scale benchmark weed dataset.
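
    A sketch of the transfer-learning setup described above, using torchvision's ResNet-50 as the pre-trained backbone with a replaced classification head and augmentation transforms; the class count, augmentation choices, and freezing policy are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: ImageNet-pretrained ResNet-50 with a new head, plus augmentation
# transforms of the kind used to mitigate class imbalance.
import torch
import torchvision
from torchvision import transforms

train_tf = transforms.Compose([            # augmentations are assumed choices
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(30),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def weed_classifier(num_classes, freeze_backbone=True):
    model = torchvision.models.resnet50(weights="IMAGENET1K_V2")
    if freeze_backbone:                    # feature extraction first...
        for p in model.parameters():
            p.requires_grad = False
    model.fc = torch.nn.Linear(model.fc.in_features, num_classes)
    return model                           # ...then unfreeze to fine-tune
```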